homomorphism count
Caterpillar GNN: Replacing Message Passing with Efficient Aggregation
Message-passing graph neural networks (MPGNNs) dominate modern graph learning. Typical efforts enhance the expressive power of MPGNNs by enriching the adjacency-based aggregation. In contrast, we introduce an efficient aggregation over walk incidence-based matrices that are constructed to deliberately trade off some expressivity for a stronger and more structured inductive bias. Our approach allows for seamless scaling between classical message passing and simpler methods based on walks. We rigorously characterize the expressive power at each intermediate step using homomorphism counts over a hierarchy of generalized caterpillar graphs. Based on this foundation, we propose Caterpillar GNNs, whose robust graph-level aggregation successfully tackles a benchmark specifically designed to challenge MPGNNs. Moreover, we demonstrate that, on real-world datasets, Caterpillar GNNs achieve comparable predictive performance while significantly reducing the number of nodes in the hidden layers of the computational graph.
Homomorphism Counts as Structural Encodings for Graph Learning
Bao, Linus, Jin, Emily, Bronstein, Michael, Ceylan, İsmail İlkan, Lanzinger, Matthias
Graph Transformers are popular neural networks that extend the well-known Transformer architecture to the graph domain. These architectures operate by applying self-attention on graph nodes and incorporating graph structure through the use of positional encodings (e.g., Laplacian positional encoding) or structural encodings (e.g., random-walk structural encoding). The quality of such encodings is critical, since they provide the necessary $\textit{graph inductive biases}$ to condition the model on graph structure. In this work, we propose $\textit{motif structural encoding}$ (MoSE) as a flexible and powerful structural encoding framework based on counting graph homomorphisms. Theoretically, we compare the expressive power of MoSE to random-walk structural encoding and relate both encodings to the expressive power of standard message passing neural networks. Empirically, we observe that MoSE outperforms other well-known positional and structural encodings across a range of architectures, and it achieves state-of-the-art performance on a widely studied molecular property prediction dataset.
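To make the link between homomorphism counts and random-walk encodings concrete, the following is a minimal sketch, not the MoSE implementation. It relies on a standard fact: the number of closed walks of length k starting at node v is the diagonal entry (A^k)[v][v], and for k ≥ 3 this equals the number of homomorphisms from the cycle C_k that send a marked vertex to v. A random-walk structural encoding uses the same diagonal, but of powers of the row-normalised matrix D^{-1}A. All function names here are hypothetical, and the sketch assumes an undirected graph with no isolated nodes.

```python
def adjacency(n, edges):
    """Dense 0/1 adjacency matrix of an undirected graph on nodes 0..n-1."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A


def mat_mul(X, Y):
    """Plain square matrix product, sufficient for small illustrative graphs."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diag_of_powers(M, max_power):
    """For each node v, return [(M^1)[v][v], ..., (M^max_power)[v][v]]."""
    n = len(M)
    P = [row[:] for row in M]
    out = [[] for _ in range(n)]
    for _ in range(max_power):
        for v in range(n):
            out[v].append(P[v][v])
        P = mat_mul(P, M)
    return out


def rooted_cycle_hom_counts(A, max_len):
    """Closed-walk counts per node (rooted cycle homomorphism counts for k >= 3)."""
    return diag_of_powers(A, max_len)


def return_probabilities(A, max_len):
    """Random-walk return probabilities per node; assumes every degree is > 0."""
    W = [[a / sum(row) for a in row] for row in A]
    return diag_of_powers(W, max_len)


triangle = adjacency(3, [(0, 1), (1, 2), (0, 2)])
print(rooted_cycle_hom_counts(triangle, 3)[0])
print(return_probabilities(triangle, 3)[0])
```

The two encodings differ only in the normalisation of the matrix whose powers are taken. The sketch covers cycle patterns alone, whereas the abstract describes an encoding built on homomorphism counts of motifs in general.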
Towards Bridging Generalization and Expressivity of Graph Neural Networks
Li, Shouheng, Geerts, Floris, Kim, Dongwoo, Wang, Qing
Expressivity and generalization are two critical aspects of graph neural networks (GNNs). While significant progress has been made in studying the expressivity of GNNs, much less is known about their generalization capabilities, particularly when dealing with the inherent complexity of graph-structured data. In this work, we address the intricate relationship between expressivity and generalization in GNNs. Theoretical studies conjecture a trade-off between the two: highly expressive models risk overfitting, while those focused on generalization may sacrifice expressivity. However, empirical evidence often contradicts this assumption, with expressive GNNs frequently demonstrating strong generalization. We explore this contradiction by introducing a novel framework that connects GNN generalization to the variance in graph structures they can capture. This leads us to propose a $k$-variance margin-based generalization bound that characterizes the structural properties of graph embeddings in terms of their upper-bounded expressive power. Our analysis does not rely on specific GNN architectures, making it broadly applicable across GNN models. We further uncover a trade-off between intra-class concentration and inter-class separation, both of which are crucial for effective generalization. Through case studies and experiments on real-world datasets, we demonstrate that our theoretical findings align with empirical results, offering a deeper understanding of how expressivity can enhance GNN generalization.
Homomorphism Counts for Graph Neural Networks: All About That Basis
Jin, Emily, Bronstein, Michael, Ceylan, İsmail İlkan, Lanzinger, Matthias
A large body of work has investigated the properties of graph neural networks and identified several limitations, particularly pertaining to their expressive power. Their inability to count certain patterns (e.g., cycles) in a graph lies at the heart of such limitations, since many functions to be learned rely on the ability to count such patterns. Two prominent paradigms aim to address this limitation by enriching the graph features with subgraph or homomorphism pattern counts. In this work, we show that both of these approaches are sub-optimal in a certain sense and argue for a more fine-grained approach, which incorporates the homomorphism counts of all structures in the "basis" of the target pattern. This yields strictly more expressive architectures without incurring any additional overhead in terms of computational complexity compared to existing approaches. We prove a series of theoretical results on node-level and graph-level motif parameters and empirically validate them on standard benchmark datasets.
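The relationship between subgraph counts and homomorphism counts can be checked by hand on a tiny case. The sketch below is an illustration under stated assumptions, not the paper's construction. For the path P3 on three vertices, the only way a homomorphism can fail to be injective in a loop-free graph is by merging the two endpoints, which yields a single edge K2. Hence inj(P3, G) = hom(P3, G) - hom(K2, G), and the number of P3 subgraphs is inj(P3, G) / 2, because P3 has two automorphisms. The brute-force `hom_count` helper is hypothetical and exponential in the pattern size, so it is suitable only for toy inputs.

```python
from itertools import product


def hom_count(pattern_n, pattern_edges, graph_n, graph_edges, injective=False):
    """Count maps from pattern nodes to graph nodes that preserve every edge.

    Brute force over all graph_n ** pattern_n assignments (toy sizes only).
    """
    adj = set()
    for u, v in graph_edges:
        adj.add((u, v))
        adj.add((v, u))
    count = 0
    for phi in product(range(graph_n), repeat=pattern_n):
        if injective and len(set(phi)) < pattern_n:
            continue
        if all((phi[a], phi[b]) in adj for a, b in pattern_edges):
            count += 1
    return count


# Example graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
G_N, G_EDGES = 4, [(0, 1), (1, 2), (2, 0), (2, 3)]
P3 = (3, [(0, 1), (1, 2)])   # path on three vertices
K2 = (2, [(0, 1)])           # single edge, the only proper quotient of P3

hom_p3 = hom_count(*P3, G_N, G_EDGES)
hom_k2 = hom_count(*K2, G_N, G_EDGES)
inj_p3 = hom_count(*P3, G_N, G_EDGES, injective=True)

# Subgraph count recovered from homomorphism counts of P3 and K2.
sub_p3 = (hom_p3 - hom_k2) // 2
print(hom_p3, hom_k2, inj_p3, sub_p3)
```

Here the two patterns P3 and K2 play the role of the homomorphism counts needed to recover the subgraph count of P3; larger patterns require more quotient patterns with their own coefficients.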
Structural Node Embeddings with Homomorphism Counts
Wolf, Hinrikus, Oeljeklaus, Luca, Kühner, Pascal, Grohe, Martin
Graph homomorphism counts, first explored by Lovász in 1967, have recently garnered interest as a powerful tool in graph-based machine learning. Grohe (PODS 2020) proposed the theoretical foundations for using homomorphism counts in machine learning on graph-level as well as node-level tasks. By their very nature, these capture local structural information, which enables the creation of robust structural embeddings. While a first approach for graph-level tasks has been made by Nguyen and Maehara (ICML 2020), we experimentally show the effectiveness of homomorphism-count-based node embeddings. Enriched with node labels, node weights, and edge weights, these offer an interpretable representation of graph data, allowing for enhanced explainability of machine learning models. We propose a theoretical framework for isomorphism-invariant homomorphism-count-based embeddings which lend themselves to a wide variety of downstream tasks. Our approach capitalises on the efficient computability of graph homomorphism counts for bounded treewidth graph classes, rendering it a practical solution for real-world applications. We demonstrate their expressivity through experiments on benchmark datasets. Although our results do not match the accuracy of state-of-the-art neural architectures, they are comparable to other advanced graph learning models. Remarkably, our approach distinguishes itself by ensuring explainability for each individual feature. By integrating interpretable machine learning algorithms like SVMs or Random Forests, we establish a seamless, end-to-end explainable pipeline. Our study contributes to the advancement of graph-based techniques that offer both performance and interpretability.
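The efficient computability mentioned above is easiest to see for tree patterns, the treewidth-1 case. The following is a minimal sketch, not the authors' code and without the node labels or weights the abstract mentions. It uses the standard dynamic programme: the number of homomorphisms from a rooted tree T that map the root to v is the product, over the children c of the root, of the sum over neighbours u of v of the count for the subtree at c rooted at u. The function name and the dictionary-based input format are assumptions made for illustration.

```python
def rooted_tree_hom_counts(children, root, adj):
    """Homomorphism counts from a rooted tree pattern, one value per graph node.

    children: dict mapping each pattern node to a list of its child nodes
              (leaves may be omitted).
    adj:      dict mapping each graph node to a list of its neighbours.
    Returns a dict v -> number of homomorphisms sending `root` to v.
    """
    def solve(t):
        # Solve each child subtree once, then combine the tables at node t.
        child_tables = [solve(c) for c in children.get(t, [])]
        table = {}
        for v in adj:
            total = 1
            for tab in child_tables:
                total *= sum(tab[u] for u in adj[v])
            table[v] = total
        return table

    return solve(root)


# Example graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

# Pattern 1: star with two leaves, rooted at its centre (count is degree squared).
star = rooted_tree_hom_counts({0: [1, 2]}, 0, graph)

# Pattern 2: path on three vertices, rooted at one end.
path = rooted_tree_hom_counts({0: [1], 1: [2]}, 0, graph)

# A two-dimensional node embedding assembled from the two patterns.
embedding = {v: (star[v], path[v]) for v in graph}
print(embedding)
```

Each pattern node is processed once and each step scans the neighbour lists, so one pattern costs time proportional to the pattern size times the number of graph edges. Every embedding coordinate corresponds to a named pattern, which is the sense in which such features are interpretable.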
Expectation-Complete Graph Representations with Homomorphisms
Welke, Pascal, Thiessen, Maximilian, Jogl, Fabian, Gärtner, Thomas
We investigate novel random graph embeddings that can be computed in expected polynomial time and that are able to distinguish all non-isomorphic graphs in expectation. Previous graph embeddings have limited expressiveness and either cannot distinguish all graphs or cannot be computed efficiently for every graph. To be able to approximate arbitrary functions on graphs, we are interested in efficient alternatives that become arbitrarily expressive with increasing resources. Our approach is based on Lovász's characterisation of graph isomorphism through an infinite-dimensional vector of homomorphism counts. Our empirical evaluation shows competitive results on several benchmark graph learning tasks.